    On structural properties of the value function for an unbounded jump Markov process with an application to a processor sharing retrial queue

    The derivation of structural properties for unbounded jump Markov processes cannot be done using standard mathematical tools, since such systems are not uniformizable. We present a promising technique, the smoothed rate truncation method, that overcomes the limitations of standard techniques and allows the derivation of structural properties. We introduce this technique through an application to a processor sharing queue with impatient customers that can retry after they renege. We are interested in structural properties of the value function of the system as a function of the arrival rate.
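    The abstract does not spell out the construction, but the idea behind smoothed rate truncation can be illustrated on a toy model: rates that grow with the state are tapered off smoothly towards a truncation level, so the perturbed chain has a bounded total jump rate, can be uniformized, and standard value iteration applies. The sketch below is only an illustration under assumed dynamics (a birth-death queue with abandonment, linear holding cost, discounted criterion); it is not the paper's model.

        # Illustration only: all rates, the cost function and the discount factor
        # below are assumptions, not the model studied in the paper.
        import numpy as np

        lam, mu, theta = 1.0, 1.2, 0.3      # arrival, service, abandonment rates (assumed)
        N = 100                             # smoothed rate truncation level

        def arrival_rate(i):
            # Smoothly truncated arrival rate: tapers linearly to 0 at level N,
            # so the total jump rate is bounded and the chain is uniformizable.
            return lam * max(0.0, 1.0 - i / N)

        def total_rate(i):
            return arrival_rate(i) + mu * (i > 0) + theta * i

        LAM = max(total_rate(i) for i in range(N + 1))   # uniformization constant

        def value_iteration(cost=lambda i: i, alpha=0.5, iters=2000):
            V = np.zeros(N + 1)
            for _ in range(iters):
                W = np.empty_like(V)
                for i in range(N + 1):
                    up = arrival_rate(i) * V[min(i + 1, N)]
                    down = (mu * (i > 0) + theta * i) * V[max(i - 1, 0)]
                    stay = (LAM - total_rate(i)) * V[i]
                    W[i] = (cost(i) + up + down + stay) / (alpha + LAM)
                V = W
            return V

        V = value_iteration()
        # Numerically check a structural property of the truncated model:
        # convexity of the value function in the queue length.
        print(np.all(np.diff(V, 2) >= -1e-8))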

    Parameter-Independent Strategies for pMDPs via POMDPs

    Markov Decision Processes (MDPs) are a popular class of models for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs), which include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances. We study pMDPs with reachability objectives in which the parameter values are unknown and impossible to measure directly during execution, but a probability distribution over the parameter values is known. We study, for the first time, the computation of parameter-independent strategies that are expectation optimal, i.e., that optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem into partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal strategies in POMDPs. We evaluate our method experimentally on several benchmarks: a motivating (repeated) learner model; a series of benchmarks of varying configurations of a robot moving on a grid; and a consensus protocol. (Comment: extended version of a QEST 2018 paper.)
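    To make "parameter-independent and expectation optimal" concrete, the toy sketch below enumerates the memoryless deterministic strategies of a small pMDP with a single parameter p and a discrete prior over p, and picks the strategy maximising the reachability probability averaged under that prior. The model, state names and prior are invented for illustration; the paper's encoding into POMDPs is more general than this brute-force enumeration.

        # Invented toy pMDP, not one of the paper's benchmarks.
        import itertools

        ACTIONS = ["a", "b"]
        PRIOR = {0.2: 0.25, 0.5: 0.5, 0.8: 0.25}    # known distribution over p

        def transitions(p):
            # Transition probabilities of the pMDP for a fixed parameter value p.
            return {
                ("s0", "a"): {"s1": p, "fail": 1 - p},
                ("s0", "b"): {"s1": 0.6, "fail": 0.4},
                ("s1", "a"): {"goal": 1 - p, "fail": p},
                ("s1", "b"): {"goal": 0.5, "fail": 0.5},
            }

        def reach_prob(strategy, p):
            # Probability of reaching 'goal' from 's0' in the Markov chain
            # induced by a memoryless (parameter-independent) strategy.
            T = transitions(p)
            x = {"goal": 1.0, "fail": 0.0}
            x["s1"] = sum(q * x[t] for t, q in T[("s1", strategy["s1"])].items())
            x["s0"] = sum(q * x[t] for t, q in T[("s0", strategy["s0"])].items())
            return x["s0"]

        best = None
        for a0, a1 in itertools.product(ACTIONS, repeat=2):
            strategy = {"s0": a0, "s1": a1}
            expected = sum(w * reach_prob(strategy, p) for p, w in PRIOR.items())
            if best is None or expected > best[1]:
                best = (strategy, expected)

        print(best)   # expectation-optimal parameter-independent strategy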

    Policy learning for time-bounded reachability in Continuous-Time Markov Decision Processes via doubly-stochastic gradient ascent

    Continuous-time Markov decision processes are an important class of models for applications ranging from cyber-physical systems to synthetic biology. A central problem is how to devise a policy to control the system so as to maximise the probability of satisfying a set of temporal logic specifications. Here we present a novel approach based on statistical model checking and an unbiased estimator of the functional gradient in the space of possible policies. The statistical approach has several advantages over conventional approaches based on uniformisation: it can also be applied when the model is replaced by a black box, and it does not suffer from state-space explosion. The use of a stochastic gradient to guide the search considerably improves the efficiency of learning policies. We demonstrate the method on a proof-of-principle non-linear population model, showing strong performance on a non-trivial task.
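    The abstract only names the estimator, so the following is a hedged sketch of the general idea under an assumed model: trajectories of a small CTMDP (a birth-death process) are simulated under a parametric softmax policy, each trajectory is scored by whether it reaches the goal state within the time bound, and the policy parameters are updated with an unbiased score-function (likelihood-ratio) gradient estimate. It is "doubly stochastic" in the sense that both the trajectories and the gradient estimate are sampled.

        # Sketch only: the model, rates, time bound and policy class are assumptions.
        import numpy as np

        rng = np.random.default_rng(0)
        M, T_BOUND = 5, 3.0                 # goal state and time bound (assumed)
        RATES = {"slow": 0.8, "fast": 2.0}  # per-action birth rates (assumed)
        DEATH = 0.5                         # death rate (assumed)
        ACTIONS = ["slow", "fast"]

        def policy_probs(theta, s):
            # Two-action softmax policy with one logit per non-goal state.
            z = np.array([0.0, theta[s]])
            e = np.exp(z - z.max())
            return e / e.sum()

        def simulate(theta):
            # One trajectory; returns (property satisfied?, score-function term).
            s, t, score = 0, 0.0, np.zeros_like(theta)
            while s != M:
                probs = policy_probs(theta, s)
                a = rng.choice(2, p=probs)
                score[s] += a - probs[1]    # d/dtheta[s] log pi(a|s) for the softmax
                up = RATES[ACTIONS[a]]
                down = DEATH if s > 0 else 0.0
                t += rng.exponential(1.0 / (up + down))
                if t > T_BOUND:             # next jump falls after the deadline
                    return 0.0, score
                s = s + 1 if rng.random() < up / (up + down) else s - 1
            return 1.0, score               # goal reached within the time bound

        theta = np.zeros(M)
        for step in range(200):             # stochastic gradient ascent
            batch = [simulate(theta) for _ in range(100)]
            grad = np.mean([sat * score for sat, score in batch], axis=0)
            theta += 0.5 * grad             # unbiased ascent direction
        print("estimated satisfaction probability:",
              np.mean([sat for sat, _ in batch]))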

    Nonzero-sum Stochastic Games

    This paper treats stochastic games. We focus on nonzero-sum games and provide a detailed survey of selected recent results. In Section 1, we consider stochastic Markov games. A correlation of the players' strategies, involving ``public signals'', is described, and a correlated equilibrium theorem recently proved by Nowak and Raghavan for discounted stochastic games with general state space is presented. We also report an extension of this result to a class of undiscounted stochastic games satisfying a uniform ergodicity condition. Stopping games are related to stochastic Markov games. In Section 2, we describe a version of Dynkin's game related to observation of a Markov process with a random assignment mechanism of states to the players. Some recent contributions of the second author in this area are reported. The paper also contains a brief overview of the theory of nonzero-sum stochastic games and stopping games, which is far from complete.
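    For orientation only, the standard textbook formulation of the objects mentioned above (not the precise statement of the theorem by Nowak and Raghavan): in a discounted stochastic game with state space X, stage payoffs r_i, transition kernel q and discount factor beta in (0,1), player i's payoff under a strategy profile pi and the equilibrium condition against unilateral deviations sigma_i read

        J_i(x, \pi) \;=\; \mathbb{E}_x^{\pi}\Big[\sum_{n=0}^{\infty} \beta^{n}\, r_i(x_n, a_n)\Big],
        \qquad
        J_i(x, \pi^{*}) \;\ge\; J_i\big(x, (\sigma_i, \pi^{*}_{-i})\big)
        \quad \text{for all } i,\ x,\ \sigma_i,

    where, in the correlated case, the profile pi* may additionally condition on a public signal drawn before each stage and observed by all players.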

    Approximate Linear Programming for Average Cost MDPs
